skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Sotudian, Shahabeddin"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Srinivasan, Kathiravan (Ed.)
    Despite their satisfactory performance, most existing listwise Learning-To-Rank (LTR) models do not consider the crucial issue of robustness. A data set can be contaminated in various ways, including human error in labeling or annotation, distributional data shift, and malicious adversaries who wish to degrade the algorithm’s performance. It has been shown that Distributionally Robust Optimization (DRO) is resilient against various types of noise and perturbations. To fill this gap, we introduce a new listwise LTR model called Distributionally Robust Multi-output Regression Ranking (DRMRR) . Different from existing methods, the scoring function of DRMRR was designed as a multivariate mapping from a feature vector to a vector of deviation scores, which captures local context information and cross-document interactions. In this way, we are able to incorporate the LTR metrics into our model. DRMRR uses a Wasserstein DRO framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution defined by a Wasserstein ball. We present a compact and computationally solvable reformulation of the min-max formulation of DRMRR. Our experiments were conducted on two real-world applications: medical document retrieval and drug response prediction, showing that DRMRR notably outperforms state-of-the-art LTR models. We also conducted an extensive analysis to examine the resilience of DRMRR against various types of noise: Gaussian noise, adversarial perturbations, and label poisoning. Accordingly, DRMRR is not only able to achieve significantly better performance than other baselines, but it can maintain a relatively stable performance as more noise is added to the data. 
    more » « less
  2. Abstract Background Predictive models utilizing social determinants of health (SDH), demographic data, and local weather data were trained to predict missed imaging appointments (MIA) among breast imaging patients at the Boston Medical Center (BMC). Patients were characterized by many different variables, including social needs, demographics, imaging utilization, appointment features, and weather conditions on the date of the appointment. Methods This HIPAA compliant retrospective cohort study was IRB approved. Informed consent was waived. After data preprocessing steps, the dataset contained 9,970 patients and 36,606 appointments from 1/1/2015 to 12/31/2019. We identified 57 potentially impactful variables used in the initial prediction model and assessed each patient for MIA. We then developed a parsimonious model via recursive feature elimination, which identified the 25 most predictive variables. We utilized linear and non-linear models including support vector machines (SVM), logistic regression (LR), and random forest (RF) to predict MIA and compared their performance. Results The highest-performing full model is the nonlinear RF, achieving the highest Area Under the ROC Curve (AUC) of 76% and average F1 score of 85%. Models limited to the most predictive variables were able to attain AUC and F1 scores comparable to models with all variables included. The variables most predictive of missed appointments included timing, prior appointment history, referral department of origin, and socioeconomic factors such as household income and access to caregiving services. Conclusions Prediction of MIA with the data available is inherently limited by the complex, multifactorial nature of MIA. However, the algorithms presented achieved acceptable performance and demonstrated that socioeconomic factors were useful predictors of MIA. In contrast with non-modifiable demographic factors, we can address SDH to decrease the incidence of MIA. 
    more » « less
  3. null (Ed.)
  4. Abstract ObjectiveTo develop predictive models of coronavirus disease 2019 (COVID-19) outcomes, elucidate the influence of socioeconomic factors, and assess algorithmic racial fairness using a racially diverse patient population with high social needs. Materials and MethodsData included 7,102 patients with positive (RT-PCR) severe acute respiratory syndrome coronavirus 2 test at a safety-net system in Massachusetts. Linear and nonlinear classification methods were applied. A score based on a recurrent neural network and a transformer architecture was developed to capture the dynamic evolution of vital signs. Combined with patient characteristics, clinical variables, and hospital occupancy measures, this dynamic vital score was used to train predictive models. ResultsHospitalizations can be predicted with an area under the receiver-operating characteristic curve (AUC) of 92% using symptoms, hospital occupancy, and patient characteristics, including social determinants of health. Parsimonious models to predict intensive care, mechanical ventilation, and mortality that used the most recent labs and vitals exhibited AUCs of 92.7%, 91.2%, and 94%, respectively. Early predictive models, using labs and vital signs closer to admission had AUCs of 81.1%, 84.9%, and 92%, respectively. DiscussionThe most accurate models exhibit racial bias, being more likely to falsely predict that Black patients will be hospitalized. Models that are only based on the dynamic vital score exhibited accuracies close to the best parsimonious models, although the latter also used laboratories. ConclusionsThis large study demonstrates that COVID-19 severity may accurately be predicted using a score that accounts for the dynamic evolution of vital signs. Further, race, social determinants of health, and hospital occupancy play an important role. 
    more » « less
  5. null (Ed.)
  6. null (Ed.)
    The new coronavirus (now named SARS-CoV-2) causing the disease pandemic in 2019 (COVID-19), has so far infected over 35 million people worldwide and killed more than 1 million. Most people with COVID-19 have no symptoms or only mild symptoms. But some become seriously ill and need hospitalization. The sickest are admitted to an Intensive Care Unit (ICU) and may need mechanical ventilation to help them breath. Being able to predict which patients with COVID-19 will become severely ill could help hospitals around the world manage the huge influx of patients caused by the pandemic and save lives. Now, Hao, Sotudian, Wang, Xu et al. show that computer models using artificial intelligence technology can help predict which COVID-19 patients will be hospitalized, admitted to the ICU, or need mechanical ventilation. Using data of 2,566 COVID-19 patients from five Massachusetts hospitals, Hao et al. created three separate models that can predict hospitalization, ICU admission, and the need for mechanical ventilation with more than 86% accuracy, based on patient characteristics, clinical symptoms, laboratory results and chest x-rays. Hao et al. found that the patients’ vital signs, age, obesity, difficulty breathing, and underlying diseases like diabetes, were the strongest predictors of the need for hospitalization. Being male, having diabetes, cloudy chest x-rays, and certain laboratory results were the most important risk factors for intensive care treatment and mechanical ventilation. Laboratory results suggesting tissue damage, severe inflammation or oxygen deprivation in the body's tissues were important warning signs of severe disease. The results provide a more detailed picture of the patients who are likely to suffer from severe forms of COVID-19. Using the predictive models may help physicians identify patients who appear okay but need closer monitoring and more aggressive treatment. The models may also help policy makers decide who needs workplace accommodations such as being allowed to work from home, which individuals may benefit from more frequent testing, and who should be prioritized for vaccination when a vaccine becomes available. 
    more » « less